*NOTE: The data variable named "Protected Heritage Area" refers to the 35 different Canadian National Parks included in the data. In all plot titles and labels, I refer to them as "Canadian National Parks" for clarity, since "Protected Heritage Area" is vague about what it refers to.
First, I import all the packages I'll be using throughout this file:
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import *
from matplotlib.pyplot import figure
import numpy as np
import matplotlib.colors as mcolors
import calendar
from datetime import datetime
#Used to view long lists of output if needed
#pd.set_option('display.height', 1000)
#pd.set_option('display.max_rows', 500)
#pd.set_option('display.max_columns', 500)
#pd.set_option('display.width', 1000)
Next, I import my dataset and set the datatypes as appropriate:
Complete_HWC_Data = pd.read_csv("/Users/nerdbear/Downloads/Complete_HWC_Data.csv", index_col=0, dtype=str, encoding='utf8')
Complete_HWC_Data[Complete_HWC_Data.columns[0:20]] = Complete_HWC_Data[Complete_HWC_Data.columns[0:20]].astype("str")
Complete_HWC_Data["Sum of Number of Animals"] = Complete_HWC_Data["Sum of Number of Animals"].astype("float")
Complete_HWC_Data["Total Staff Hours"] = Complete_HWC_Data["Total Staff Hours"].astype("float")
Complete_HWC_Data["Total Staff Involved"] = Complete_HWC_Data["Total Staff Involved"].astype("float")
Complete_HWC_Data["Latitude Public"] = Complete_HWC_Data["Latitude Public"].astype("float")
Complete_HWC_Data["Longitude Public"] = Complete_HWC_Data["Longitude Public"].astype("float")
#Complete_HWC_Data["Species Common Name"] = Complete_HWC_Data["Species Common Name"].astype("str")
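#Every column from position 20 onward is an Activity Type_/Response Type_ indicator; use the same slice on both sides
Complete_HWC_Data[Complete_HWC_Data.columns[20:]] = Complete_HWC_Data[Complete_HWC_Data.columns[20:]].astype("float")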
#Complete_HWC_Data["Animal Health Status"] = Complete_HWC_Data["Animal Health Status"].astype("str")
#Complete_HWC_Data["Cause of Animal Health Status"] = Complete_HWC_Data["Cause of Animal Health Status"].astype("str")
Complete_HWC_Data.head()
|  | UniqueID | Incident Number | Incident Date | Field Unit | Protected Heritage Area | Incident Type | Latitude Public | Longitude Public | Within Park | Total Staff Involved | Total Staff Hours | Species Common Name | Sum of Number of Animals | Animal Health Status | Cause of Animal Health Status | Animal Behaviour | Reason for Animal Behaviour | Animal Attractant | Deterrents Used | Animal Response to Deterrents | Activity Type_Backpacking – Multiday Trips | Activity Type_Beach Recreation | Activity Type_Boating - Coastal/Marine | Activity Type_Boating - Commercial | Activity Type_Boating - Motorized Pleasure Craft | Activity Type_Bush Party | Activity Type_Camping - Backcountry | Activity Type_Camping - Frontcountry | Activity Type_Camping - Huts and Lodges | Activity Type_Camping - Winter Frontcountry | Activity Type_Canoeing - Flatwater | Activity Type_Canoeing - Swiftwater | Activity Type_Canyoneering | Activity Type_Climbing - Mountaineering | Activity Type_Climbing - Technical Rock | Activity Type_Climbing - Waterfall Ice | Activity Type_Commercial Transportation Operation | Activity Type_Cycling | Activity Type_Cycling - Mountain Biking | Activity Type_Cycling - Road/Shared Path | Activity Type_Cycling - Winter | Activity Type_Dog Walking | Activity Type_Dogsledding | Activity Type_Domestic Residence Activity | Activity Type_Driving | Activity Type_Field Sports | Activity Type_Fishing | Activity Type_Flight - HETS | Activity Type_Flight - Hang-gliding/Parapenting | Activity Type_Flight - Helicopter | Activity Type_Flight - Sightseeing/Site Access | Activity Type_Golfing | Activity Type_Heritage Activity - Bird Watching | Activity Type_Heritage Activity - History Activities | Activity Type_Heritage Activity - Photography and Art | Activity Type_Heritage Activity - Sightseeing | Activity Type_Heritage Activity - Wildlife Observation | Activity Type_Hiking / Walking | Activity Type_Horse Riding - Day Trip | Activity Type_Horse Riding - Multiday | Activity Type_Ice Skating | Activity Type_Kayaking - Coastal | Activity Type_Kayaking - Flatwater | Activity Type_Kayaking - Swiftwater | Activity Type_Mooring | Activity Type_Not Applicable | Activity Type_Orienteering / Geocaching | Activity Type_Other | Activity Type_Paddleboarding - Coastal | Activity Type_Paddleboarding - Flatwater | Activity Type_Park Operations | Activity Type_Park Ops - Avalanche Forecasting | Activity Type_Park Ops - Avalanche Control | Activity Type_Park Ops - Search and Rescue | Activity Type_Park Ops - Training | Activity Type_Picnicking / BBQ | Activity Type_Playground Activities | Activity Type_Rafting - Flatwater | Activity Type_Rafting - Swiftwater | Activity Type_Railway | Activity Type_Research - Scientific/Social | Activity Type_Resource Harvesting - Hunting/Fishing/Gathering/Trapping | Activity Type_Roller Sports | Activity Type_Running - Road | Activity Type_Running - Trail | Activity Type_Sail Sports - Wind / Kite Surfing | Activity Type_Scrambling | Activity Type_Skiing - Crosscountry | Activity Type_Skiing/Boarding - Backcountry | Activity Type_Skiing/Boarding - Ski Resort In Bounds | Activity Type_Skiing/Boarding - Ski Resort Out of Bounds | Activity Type_Sledding/Tobogganning | Activity Type_Snowmobiling | Activity Type_Snowshoeing | Activity Type_Special Event - Participative Audience | Activity Type_Special Events - Passive Audience | Activity Type_Stakeholder Operations | Activity Type_Surfing | Activity Type_Swimming - Cliff Jumping | Activity Type_Swimming - Coastal | Activity Type_Swimming - Facilities | Activity Type_Swimming - Flat Water | Activity Type_Swimming - Swiftwater | Activity Type_Townsite Activity | Activity Type_Tram/Ski Lift/Gondola | Activity Type_Tubing / River Drifting | Activity Type_Unknown | Activity Type_Via-Ferrata | Activity Type_nan | Response Type_ | Response Type_Assist Visitor | Response Type_Assist other Agency | Response Type_Assist other Field Unit | Response Type_Attractant Management | Response Type_Aversive Conditioning | Response Type_Capture and transport to captivity | Response Type_Clean Up | Response Type_Close Area | Response Type_Close Road | Response Type_Collar | Response Type_Collect Sample | Response Type_Cull | Response Type_Destroy Animal | Response Type_Disentangle | Response Type_Dispatch other Agency | Response Type_Disperse Wildlife Jam | Response Type_Dispose Carcass | Response Type_Ear Tag | Response Type_Euthanize | Response Type_Evacuate Visitor | Response Type_Haze - Hard | Response Type_Haze - Soft | Response Type_Immobilize Animal | Response Type_Inform Visitor | Response Type_Infrastructure modification | Response Type_Investigate Incident | Response Type_Issue Prohibited Activity Order | Response Type_Issue Restricted Activity Order | Response Type_Issue Stop Work Order | Response Type_Leave on Landscape | Response Type_Mark - paint | Response Type_Monitor - Camera | Response Type_Monitor - patrol | Response Type_Monitor - visitor and staff sighting | Response Type_Necropsy | Response Type_No response required | Response Type_Not Applicable | Response Type_Refer incident to other agency | Response Type_Rehabilitate area | Response Type_Relocate animal (s) | Response Type_Request assistance - other Agency | Response Type_Request assistance - police | Response Type_Traffic control | Response Type_Translocate | Response Type_Trap or snare | Response Type_Unable to respond | Response Type_Warning signs | Response Type_nan |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | BAN2010-0003.3 | BAN2010-0003 | 2010-01-01 | Banff Field Unit | Banff National Park of Canada | Human Wildlife Interaction | 51.161093 | -115.593386 | Yes | 1.0 | 2.33 | Coyote | 2.0 | Healthy | nan | Avoidance | Surprise | Prey animal (natural) | Presence of Officer/Person | nan | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1 | BAN2010-0003.2 | BAN2010-0003 | 2010-01-01 | Banff Field Unit | Banff National Park of Canada | Human Wildlife Interaction | 51.161093 | -115.593386 | Yes | 1.0 | 2.33 | Elk | 1.0 | Dead | Predation | nan | nan | nan | nan | nan | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2 | BAN2010-0003.1 | BAN2010-0003 | 2010-01-01 | Banff Field Unit | Banff National Park of Canada | Human Wildlife Interaction | 51.161093 | -115.593386 | Yes | 1.0 | 2.33 | Wolf | 3.0 | Not Located | nan | nan | nan | Prey animal (natural) | nan | nan | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 3 | JNP2010-0011.1 | JNP2010-0011 | 2010-01-01 | Jasper Field Unit | Jasper National Park of Canada | Rescued/Recovered/Found Wildlife | 53.139120 | -117.964219 | Yes | 1.0 | 1.00 | White-tailed Deer | 1.0 | Dead | Collision | nan | nan | nan | nan | nan | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 4 | JNP2010-0015.1 | JNP2010-0015 | 2010-01-01 | Jasper Field Unit | Jasper National Park of Canada | Attractant | 53.050492 | -118.073612 | Yes | 1.0 | 2.50 | None | 0.0 | nan | nan | nan | nan | Grain | nan | nan | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
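The type conversions above could also be written as a single `astype` call with a column-to-dtype mapping. Below is a minimal sketch on a toy frame, not the real CSV: the `toy` frame, its four columns, and the `Response Type_Clean Up` values are made up for illustration, and I am assuming that every indicator column starts with `Activity Type_` or `Response Type_`, as in the table above.

```python
import pandas as pd

# Toy stand-in for the CSV (illustrative values); everything starts as text,
# as it does when reading with dtype=str.
toy = pd.DataFrame({
    "Field Unit": ["Banff Field Unit", "Jasper Field Unit", "Jasper Field Unit"],
    "Total Staff Hours": ["2.33", "1.00", "2.50"],
    "Latitude Public": ["51.161093", "53.139120", "53.050492"],
    "Response Type_Clean Up": ["0.0", "1.0", "0.0"],
})

# Named measurement columns plus every indicator column (assumed prefixes).
numeric_cols = ["Total Staff Hours", "Latitude Public"]
indicator_cols = [
    col for col in toy.columns
    if col.startswith(("Activity Type_", "Response Type_"))
]

# One astype call with a mapping converts only the listed columns.
toy = toy.astype({col: "float" for col in numeric_cols + indicator_cols})
print(toy.dtypes)
```

Selecting the indicator columns by prefix rather than by position means the conversion does not depend on exact column counts.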
Here I set up the variables that are used in the plots below:
#Copy the columns needed for the monthly breakdown; .copy() avoids SettingWithCopyWarning on later assignments.
#"Incident Month" is derived from the date index further down, so no intermediate conversion is needed here.
Complete_HWC_Data_Month = Complete_HWC_Data.loc[:, ["Incident Type", "Field Unit", "Incident Date"]].copy()
#Complete_HWC_Data_Month
Complete_HWC_Data_Month = Complete_HWC_Data_Month.set_index("Incident Date")
Complete_HWC_Data_Month["Incident Month"] = pd.DatetimeIndex(Complete_HWC_Data_Month.index).month
#Complete_HWC_Data_Month
Complete_HWC_Data_Month = Complete_HWC_Data_Month.replace({'Incident Month' : { 1 : "January", 2 : "February", 3 : "March", 4 : "April", 5 : "May", 6 : "June", 7 : "July", 8 : "August", 9 : "September", 10 : "October", 11 : "November", 12 : "December"}})
Complete_HWC_Data_Month["Incident Month"] = Complete_HWC_Data_Month["Incident Month"].astype('category')
Complete_HWC_Data_Month["Incident Month"] = Complete_HWC_Data_Month["Incident Month"].cat.reorder_categories(["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"], ordered=True)
Complete_HWC_Data_Month["Incident Month"].cat.categories
IncidentsByMonth = Complete_HWC_Data_Month["Incident Type"].groupby([Complete_HWC_Data_Month["Incident Month"], Complete_HWC_Data_Month["Incident Type"]]).count().reset_index(name="count")
IncidentsByMonth = IncidentsByMonth.pivot_table("count", "Incident Month", "Incident Type")
#IncidentsByMonth
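The month-name mapping and category ordering above can probably be shortened with `dt.month_name()` and `calendar.month_name`. This is only a sketch on made-up rows (the `toy` frame is hypothetical), and it assumes an English locale so that the names from `calendar` match the names pandas returns.

```python
import calendar
import pandas as pd

# Hypothetical rows standing in for the incident data.
toy = pd.DataFrame({
    "Incident Date": ["2010-01-01", "2010-07-15", "2011-01-20", "2011-03-02"],
    "Incident Type": ["Attractant", "Human Wildlife Interaction",
                      "Human Wildlife Interaction", "Attractant"],
})

dates = pd.to_datetime(toy["Incident Date"])
month_order = list(calendar.month_name)[1:]  # "January" ... "December" in an English locale

# month_name() yields the label directly; the ordered categorical makes
# sorting and comparisons follow calendar order instead of alphabetical order.
toy["Incident Month"] = pd.Categorical(
    dates.dt.month_name(), categories=month_order, ordered=True
)

# Count incidents per (month, type) pair and spread the types into columns.
by_month = (
    toy.groupby(["Incident Month", "Incident Type"], observed=True)
       .size()
       .unstack(fill_value=0)
)
print(by_month)
```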
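#Copy the columns needed for the yearly and per-park breakdowns; .copy() avoids SettingWithCopyWarning
Complete_HWC_Data_Year = Complete_HWC_Data.loc[:, ["Incident Type", "Field Unit", "Incident Date", "Protected Heritage Area"]].copy()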
Complete_HWC_Data_Year["Incident Year"] = pd.to_datetime(Complete_HWC_Data_Year["Incident Date"]).dt.to_period("Y")
Complete_HWC_Data_Year["Incident Year"] = Complete_HWC_Data_Year["Incident Year"].astype("str")
Complete_HWC_Data_Year["Incident Year"] = pd.to_datetime(Complete_HWC_Data_Year["Incident Year"])
#Complete_HWC_Data_Year
IncidentsByYear = Complete_HWC_Data_Year["Incident Type"].groupby([Complete_HWC_Data_Year["Incident Year"], Complete_HWC_Data_Year["Incident Type"]]).count().reset_index(name="count")
#IncidentsByYear
IncidentsByYear = IncidentsByYear.pivot_table("count", "Incident Year", "Incident Type")
#IncidentsByYear
Parks = Complete_HWC_Data_Year["Protected Heritage Area"].unique()
#Parks
IncidentsByPark = Complete_HWC_Data_Year["Protected Heritage Area"].groupby([Complete_HWC_Data_Year["Protected Heritage Area"]]).count().reset_index(name="count")
IncidentsByPark = IncidentsByPark.pivot_table("count", "Protected Heritage Area")
#IncidentsByPark.index
IncidentsByPark=IncidentsByPark.sort_values(by=['count'])
#IncidentsByPark
#HighIncParks has to be built after IncidentsByPark exists, otherwise this raises a NameError
HighIncParks = IncidentsByPark.loc[IncidentsByPark["count"] > 1000]
#HighIncParks
IncidentsByYearByPark = Complete_HWC_Data_Year["Protected Heritage Area"].groupby([Complete_HWC_Data_Year["Incident Year"], Complete_HWC_Data_Year["Protected Heritage Area"]]).count().reset_index(name="count")
IncidentsByYearByPark = IncidentsByYearByPark.pivot_table("count", "Incident Year", "Protected Heritage Area")
#IncidentsByYearByPark
IncidentsByTypeByPark = Complete_HWC_Data_Year["Incident Type"].groupby([Complete_HWC_Data_Year["Protected Heritage Area"], Complete_HWC_Data_Year["Incident Type"]]).count().reset_index(name="count")
#Sort by park, then by count within each park (the pivot below re-sorts by park name anyway)
IncidentsByTypeByPark = IncidentsByTypeByPark.sort_values(by=["Protected Heritage Area", "count"])
#IncidentsByTypeByPark
IncidentsByTypeByPark = IncidentsByTypeByPark.pivot_table("count", "Protected Heritage Area","Incident Type",)
#IncidentsByTypeByPark
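Each groupby, count, reset_index and pivot_table chain above builds a frequency table, which `pd.crosstab` can likely produce in one step. The sketch below uses a made-up five-row frame (`toy`, `busy_parks` and the threshold of 2 are illustrative only) and stores the year as an integer via `dt.year` rather than as a datetime, which differs from the notebook.

```python
import pandas as pd

# Hypothetical rows standing in for the incident data.
toy = pd.DataFrame({
    "Incident Date": ["2010-01-01", "2010-07-15", "2011-01-20",
                      "2011-03-02", "2011-05-09"],
    "Protected Heritage Area": [
        "Banff National Park of Canada", "Banff National Park of Canada",
        "Jasper National Park of Canada", "Banff National Park of Canada",
        "Jasper National Park of Canada",
    ],
    "Incident Type": ["Attractant", "Human Wildlife Interaction",
                      "Human Wildlife Interaction", "Attractant",
                      "Human Wildlife Interaction"],
})
toy["Incident Year"] = pd.to_datetime(toy["Incident Date"]).dt.year

# Rows are years (or parks), columns are incident types, cells are counts.
by_year = pd.crosstab(toy["Incident Year"], toy["Incident Type"])
by_type_by_park = pd.crosstab(toy["Protected Heritage Area"], toy["Incident Type"])

# Total incidents per park, sorted ascending, then filtered by a threshold
# (the notebook uses 1000; 2 is used here only because the toy data is tiny).
park_counts = toy["Protected Heritage Area"].value_counts().sort_values()
busy_parks = park_counts[park_counts > 2]
print(by_year, by_type_by_park, busy_parks, sep="\n\n")
```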
##
#The for loop I use below doesn't function how I wanted - and I ran out of time to get it
#to do exactly what I wanted it to do (count frequency of Incident Types by Species).
#I'm still using the output of the loop as it does provide counts by species, and
#can be used to plot only the species that are included in more than 100 incidents.
#If I can get the for loop working properly later on, I'll be able to plot the Species
#and Incident Types better.
ValueCounts = Complete_HWC_Data["Species Common Name"].value_counts()
SpeciesData = Complete_HWC_Data.loc[:, ["Incident Type", "Species Common Name"]].copy()
Counts = []
#Look up the total number of incidents for the species on each row
for i in Complete_HWC_Data["Species Common Name"]:
    Counts.append(ValueCounts[i])
SpeciesData.insert(0, "Species_Counts", Counts)
#SpeciesData
HighSpeciesData = SpeciesData.loc[SpeciesData["Species_Counts"] > 100]
HighSpeciesData=HighSpeciesData.sort_values(by=['Species_Counts'])
HighSpeciesCount = HighSpeciesData["Species Common Name"].unique()
#HighSpeciesCount
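One possible way to get what the loop was meant to do (frequency of Incident Types by Species) without looping is a grouped `transform` for the per-row species totals, followed by `pd.crosstab`. This is a sketch on made-up rows, not a drop-in replacement: the `toy` frame and the threshold of 1 are illustrative (the notebook uses 100).

```python
import pandas as pd

# Hypothetical rows standing in for the incident data.
toy = pd.DataFrame({
    "Species Common Name": ["Elk", "Elk", "Black Bear", "Elk", "Wolf", "Black Bear"],
    "Incident Type": ["Human Wildlife Interaction", "Attractant",
                      "Human Wildlife Interaction", "Human Wildlife Interaction",
                      "Human Wildlife Interaction", "Attractant"],
})

# Per-row species frequency: each row gets the size of its species group.
toy["Species_Counts"] = (
    toy.groupby("Species Common Name")["Species Common Name"].transform("count")
)

# Keep only species that appear in more than one incident.
frequent = toy.loc[toy["Species_Counts"] > 1]

# Incident-type counts per species: rows are species, columns are types.
type_by_species = pd.crosstab(frequent["Species Common Name"], frequent["Incident Type"])
print(type_by_species)
```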
###
IncBySpecies = Complete_HWC_Data.loc[:, ("Incident Type", "Species Common Name")]
IncBySpecies = IncBySpecies["Incident Type"].groupby([IncBySpecies["Species Common Name"], IncBySpecies["Incident Type"]]).count().reset_index(name="count")
IncBySpecies = IncBySpecies.set_index("Incident Type")
#Complete_HWC_Data["Incident Type"].loc[Complete_HWC_Data["Species Common Name"]=="Black Bear"]
IncBySpecies = IncBySpecies.pivot_table("count", "Incident Type", "Species Common Name")
#IncBySpecies
HealthBySpecies = Complete_HWC_Data.loc[:, ("Animal Health Status", "Species Common Name")]
HealthBySpecies = HealthBySpecies["Animal Health Status"].groupby([HealthBySpecies["Species Common Name"], HealthBySpecies["Animal Health Status"]]).count().reset_index(name="count")
HealthBySpecies = HealthBySpecies.loc[HealthBySpecies["Animal Health Status"] != "Not Applicable"]
HealthBySpecies = HealthBySpecies.set_index("Animal Health Status")
#Complete_HWC_Data["Incident Type"].loc[Complete_HWC_Data["Species Common Name"]=="Black Bear"]
HealthBySpecies = HealthBySpecies.pivot_table("count", "Animal Health Status", "Species Common Name")
#HealthBySpecies
SpeciesByHealth = Complete_HWC_Data.loc[:, ("Animal Health Status", "Species Common Name")]
SpeciesByHealth = SpeciesByHealth["Animal Health Status"].groupby([SpeciesByHealth["Species Common Name"], SpeciesByHealth["Animal Health Status"]]).count().reset_index(name="count")
SpeciesByHealth = SpeciesByHealth.loc[SpeciesByHealth["Animal Health Status"] != "Not Applicable"]
SpeciesByHealth = SpeciesByHealth.set_index("Species Common Name")
SpeciesByHealth = SpeciesByHealth.pivot_table("count","Species Common Name", "Animal Health Status")
#SpeciesByHealth
plt.figure(figsize=(10,5));
plt.hist(Complete_HWC_Data["Field Unit"], width = 0.8, bins = 19, color = "green")
plt.ylabel('Number of Incidents', fontweight="bold", size = 18)
plt.xlabel('Field Unit', fontweight="bold", size = 18);
plt.title('Total Number of Incidents per Field Unit', fontweight="bold", size = 22)
plt.xticks(size=16, rotation='vertical')
plt.yticks(size=16)
plt.show()
plt.figure(figsize=(10,7));
plt.hist(Complete_HWC_Data["Protected Heritage Area"], width = 0.8, bins = 35, color = "lightseagreen")
plt.ylabel('Number of Incidents', fontweight="bold", size = 18)
plt.xlabel('Canadian National Park', fontweight="bold", size = 18);
plt.title('Total Number of Incidents per Canadian National Park', fontweight="bold", size = 22)
plt.xticks(size=16, rotation='vertical')
plt.yticks(size=16)
plt.show()
plt.figure(figsize=(10,7));
plt.hist(Complete_HWC_Data["Incident Type"], width = 0.8, bins = 9, color = "cornflowerblue")
plt.ylabel('Number of Incidents', fontweight="bold", size = 18)
plt.xlabel('Incident Type', fontweight="bold", size = 18);
plt.title('Total Number of Incidents per Incident Type', fontweight="bold", size = 22)
plt.xticks(size=16, rotation='vertical')
plt.yticks(size=16)
plt.show()
plt.figure(figsize=(10,7));
plt.hist(Complete_HWC_Data["Latitude Public"],width=2, bins=15, color = "goldenrod")
plt.ylabel('Number of Incidents', fontweight="bold", size = 18)
plt.xlabel('Latitude', fontweight="bold", size = 18);
plt.title('Total Number of Incidents by Latitude', fontweight="bold", size = 22)
plt.xticks(size=16, rotation='vertical')
plt.yticks(size=16)
plt.show()
plt.figure(figsize=(10,7));
plt.hist(Complete_HWC_Data["Longitude Public"], width=2, bins=15, color = "tomato")
plt.ylabel('Number of Incidents', fontweight="bold", size = 18)
plt.xlabel('Longitude', fontweight="bold", size = 18);
plt.title('Total Number of Incidents by Longitude', fontweight="bold", size = 22)
plt.xticks(size=16, rotation='vertical')
plt.yticks(size=16)
plt.show()
plt.figure(figsize=(10,7));
plt.hist(Complete_HWC_Data["Animal Health Status"], width = 0.8, bins = 9, color="olivedrab")
plt.ylabel('Number of Incidents', fontweight="bold", size = 18);
plt.xlabel('Animal Health Status', fontweight="bold", size = 18);
plt.title('Total Number of Incidents per Animal Health Status', fontweight="bold", size = 22)
plt.xticks(size=16, rotation='vertical')
plt.yticks(size=16)
plt.show()
plt.figure(figsize=(10,10));
plt.hist(Complete_HWC_Data["Cause of Animal Health Status"], width = 0.8, bins = 18, color="lightsteelblue")
plt.ylabel('Number of Incidents', fontweight="bold", size = 18);
plt.xlabel('Cause of Animal Health Status', fontweight="bold", size = 18);
plt.title('Total Number of Incidents per Cause of Animal Health Status', fontweight="bold", size = 22)
plt.xticks(size=16, rotation='vertical')
plt.yticks(size=16)
plt.show()
plt.figure(figsize=(10,7));
plt.hist(Complete_HWC_Data["Cause of Animal Health Status"].loc[Complete_HWC_Data["Cause of Animal Health Status"] != "nan"], width = 0.8, bins = 18, color="lightsteelblue")
plt.ylabel('Number of Incidents', fontweight="bold", size = 18);
plt.xlabel('Cause of Animal Health Status', fontweight="bold", size = 18);
plt.title('Total Number of Incidents per Cause of Animal Health Status (Missing Values Removed)', fontweight="bold", size = 22)
plt.xticks(size=16, rotation='vertical')
plt.yticks(size=16)
plt.show()
plt.figure(figsize=(10,7));
plt.hist(Complete_HWC_Data["Animal Behaviour"], width = 0.8, bins = 22, color="tan")
plt.ylabel('Number of Incidents', fontweight="bold", size = 18);
plt.xlabel('Animal Behaviour', fontweight="bold", size = 18);
plt.title('Total Number of Incidents per Animal Behaviour', fontweight="bold", size = 22)
plt.xticks(size=16, rotation='vertical')
plt.yticks(size=16)
plt.show()
plt.figure(figsize=(10,7));
plt.hist(Complete_HWC_Data["Animal Behaviour"].loc[Complete_HWC_Data["Animal Behaviour"] != "nan"], width = 0.8, bins = 22, color="tan")
plt.ylabel('Number of Incidents', fontweight="bold", size = 18);
plt.xlabel('Animal Behaviour', fontweight="bold", size = 18);
plt.title('Total Number of Incidents per Animal Behaviour (Missing Values Removed)', fontweight="bold", size = 22)
plt.xticks(size=16, rotation='vertical')
plt.yticks(size=16)
plt.show()
plt.figure(figsize=(10,7));
plt.hist(Complete_HWC_Data["Reason for Animal Behaviour"], width = 0.8, bins = 17, color="palegreen")
plt.ylabel('Number of Incidents', fontweight="bold", size = 18);
plt.xlabel('Reason for Animal Behaviour', fontweight="bold", size = 18);
plt.title('Total Number of Incidents per Reason for Animal Behaviour', fontweight="bold", size = 22)
plt.xticks(size=16, rotation='vertical')
plt.yticks(size=16)
plt.show()
plt.figure(figsize=(10,7));
plt.hist(Complete_HWC_Data["Reason for Animal Behaviour"].loc[Complete_HWC_Data["Reason for Animal Behaviour"] != "nan"], width = 0.8, bins = 16, color="palegreen")
plt.ylabel('Number of Incidents', fontweight="bold", size = 18);
plt.xlabel('Reason for Animal Behaviour', fontweight="bold", size = 18);
plt.title('Total Number of Incidents per Reason for Animal Behaviour (Missing Values Removed)', fontweight="bold", size = 22)
plt.xticks(size=16, rotation='vertical')
plt.yticks(size=16)
plt.show()
plt.figure(figsize=(10,7));
plt.hist(Complete_HWC_Data["Animal Attractant"], width = 0.8, bins = 20, color="palevioletred")
plt.ylabel('Number of Incidents', fontweight="bold", size = 18);
plt.xlabel('Animal Attractant', fontweight="bold", size = 18);
plt.title('Total Number of Incidents per Animal Attractant', fontweight="bold", size = 22)
plt.xticks(size=16, rotation='vertical')
plt.yticks(size=16)
plt.show()
plt.figure(figsize=(10,7));
plt.hist(Complete_HWC_Data["Animal Attractant"].loc[Complete_HWC_Data["Animal Attractant"] != "nan"], width = 0.8, bins = 19, color="palevioletred")
plt.ylabel('Number of Incidents', fontweight="bold", size = 18);
plt.xlabel('Animal Attractant', fontweight="bold", size = 18);
plt.title('Total Number of Incidents per Animal Attractant (Missing Values Removed)', fontweight="bold", size = 22)
plt.xticks(size=16, rotation='vertical')
plt.yticks(size=16)
plt.show()
plt.figure(figsize=(10,7));
plt.hist(Complete_HWC_Data["Deterrents Used"], width = 0.8, bins = 26, color="plum")
plt.ylabel('Number of Incidents', fontweight="bold", size = 18);
plt.xlabel('Deterrents Used', fontweight="bold", size = 18);
plt.title('Total Number of Incidents per Deterrents Used', fontweight="bold", size = 22)
plt.xticks(size=16, rotation='vertical')
plt.yticks(size=16)
plt.show()
plt.figure(figsize=(10,7));
plt.hist(Complete_HWC_Data["Deterrents Used"].loc[Complete_HWC_Data["Deterrents Used"] != "nan"], width = 0.8, bins = 25, color="plum")
plt.ylabel('Number of Incidents', fontweight="bold", size = 18);
plt.xlabel('Deterrents Used', fontweight="bold", size = 18);
plt.title('Total Number of Incidents per Deterrents Used (Missing Values Removed)', fontweight="bold", size = 22)
plt.xticks(size=16, rotation='vertical')
plt.yticks(size=16)
plt.show()
plt.figure(figsize=(12,7));
plt.hist(Complete_HWC_Data["Animal Response to Deterrents"], width = 0.8, bins = 11, color="lightskyblue")
plt.ylabel('Number of Incidents', fontweight="bold", size = 18);
plt.xlabel('Animal Response to Deterrents', fontweight="bold", size = 18);
plt.title('Total Number of Incidents per Animal Response to Deterrents', fontweight="bold", size = 22)
plt.xticks(size=16, rotation='vertical')
plt.yticks(size=16)
plt.show()
Notes:
plt.figure(figsize=(20,10));
plt.hist(HighSpeciesData["Species Common Name"], width = 0.9, bins = 36, color="lightcoral")
plt.ylabel('Number of Incidents', fontweight="bold", size = 18)
plt.xlabel('Species', fontweight="bold", size = 18);
plt.title('Total Number of Incidents by Species Common Name', fontweight="bold", size = 22)
plt.xticks(size=16, rotation='vertical')
plt.yticks(size=16)
plt.show()
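Since most of the variables plotted above are categorical, counting the categories first and drawing a bar chart avoids having to guess a `bins` value for each column. The helper below is a hypothetical sketch (`plot_category_counts` is my own name, not part of the notebook), shown on a made-up Series; it assumes missing values appear as the string "nan", as they do after the `astype("str")` conversion.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def plot_category_counts(series, title, color="green", drop_missing=False):
    """Hypothetical helper: bar chart of how often each category occurs."""
    if drop_missing:
        # Missing values are assumed to be stored as the string "nan".
        series = series[series != "nan"]
    counts = series.value_counts()
    fig, ax = plt.subplots(figsize=(10, 7))
    ax.bar(list(counts.index), counts.values, width=0.8, color=color)
    ax.set_xlabel(series.name, fontweight="bold", size=18)
    ax.set_ylabel("Number of Incidents", fontweight="bold", size=18)
    ax.set_title(title, fontweight="bold", size=22)
    ax.tick_params(axis="x", labelrotation=90, labelsize=16)
    ax.tick_params(axis="y", labelsize=16)
    return counts, ax


# Made-up values for illustration only.
toy = pd.Series(
    ["Collision", "nan", "nan", "Entrapment", "Collision", "nan"],
    name="Cause of Animal Health Status",
)
counts, ax = plot_category_counts(toy, "Toy example", color="lightsteelblue",
                                  drop_missing=True)
print(counts)
```

Returning `counts` alongside the axes also gives the exact numbers behind each bar, which the histograms only show approximately.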
Observations from histograms:
While the “Field Unit” and “Protected Heritage Area” (i.e. Canadian National Park) variables appear to overlap, both are interesting to look at. “Protected Heritage Area” directly reflects the 35 Canadian National Parks, while “Field Unit” reflects the “name of the administrative unit of Parks Canada Agency that is responsible for management of the incident based on its location” (from the description provided in the “2. pca-national-human-wildlife-coexistence-header-descriptions.csv” file of this data).
The “Field Unit” histogram shows that the majority of the incidents in the dataset occurred in the “Banff” and “Jasper” Field Units, with over 20,000 incidents; the “Lake Louise, Yoho, and Kootenay Field Unit” is next highest, but with significantly fewer incidents at just under 10,000.
Looking at the Canadian National Parks (i.e. “Protected Heritage Area”), “Banff” and “Jasper” National Parks of Canada have the most incidents, with over 25,000; the next highest is “Waterton Lakes” with just under 5,000.
Looking at Incident Types, “Human Wildlife Interaction” is the most frequent at nearly 50,000, and the next highest is “Rescued/Recovered/Found Wildlife” at over 10,000. These will likely be the two Incident Types I focus on in the prediction model, and they are the most important ones to investigate for their causes.
Looking at both Latitude and Longitude, most incidents occur between 50 and 55 degrees latitude and between -135 and -125 degrees longitude. Referencing those values on a map, I can see that the incidents are mainly occurring around British Columbia. I am more interested in the location as indicated by the Park name than in the longitude/latitude values.
For Animal Health Status, it is obvious that there are many missing values: the most frequent value, at over 30,000 incidents, is nan. Next is “Healthy” with around 25,000. I will be interested in looking at the Healthy animals, but also at the Dead and Injured health statuses and the factors that affect those statuses.
Cause of Animal Health Status is not very informative, with 60,000 missing values (nan). Removing the missing values, Collision and Entrapment are the most prevalent, but with around or under 5,000 incidents each. It would be interesting to look more at the factors involved with those two causes.
Animal Behaviour has over 25,000 nan (missing) values. Removing missing values, two values are by far the most frequent: “Presence – Wildlife Exclusion Zones” and “Indifferent to People/Vehicles”, with over 15,000.
For the Reason for Animal Behaviour, again the most frequent occurrence is the missing values (nan) at over 40,000. Removing the missing values, the most frequent known value by a wide margin is “Habituation” with over 15,000.
Looking at Animal Attractants, missing values (nan) are the most prevalent at nearly 50,000. Removing the missing values, the most frequent known value is “Vegetation (natural)”.
Looking at Deterrents Used, missing values (nan) are the most prevalent. Removing the missing values, we can see that Noise – Voice is the most prevalent at nearly 2,500, with Presence of Officer/Person, Not Applicable, and Impact – Chalkball being the next most frequent at over 2,000.
Looking at Species Common Name, the 4 most frequent are Black Bear, Elk, Grizzly Bear, and Mule Deer. I am interested in looking further at the Incident Type distribution across these 4 species.
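The missing-value figures in the observations above are read off the histograms. They could also be computed directly, for example as a count and share of "nan" per column. This is a minimal sketch on made-up values (the `toy` frame is illustrative), assuming that missing entries are stored as the string "nan" after the `astype("str")` conversion.

```python
import pandas as pd

# Made-up values for illustration; "nan" is the string left behind by astype("str").
toy = pd.DataFrame({
    "Animal Health Status": ["Healthy", "Dead", "nan", "Dead"],
    "Animal Behaviour": ["Avoidance", "nan", "nan", "nan"],
}).astype("str")

placeholder = "nan"

# Comparing the whole frame against the placeholder gives a boolean frame;
# summing it counts the placeholder entries in each column.
missing_counts = (toy == placeholder).sum()
missing_share = missing_counts / len(toy)

summary = pd.DataFrame({"missing": missing_counts, "share": missing_share})
print(summary.sort_values("share", ascending=False))
```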
#After the basic histogram above and seeing how few incidents several parks had,
#I wanted to view the exact number of incidents that occurred in each park.
x = IncidentsByPark.index
y = IncidentsByPark["count"]
fig,ax = plt.subplots(figsize=(35,20))
plt.plot(x, y, label=IncidentsByPark.columns, marker="o", mew=3, linewidth=8, color = "lightseagreen")
plt.xlabel("National Park", fontweight="bold", size = 26)
plt.ylabel("Number of Incidents", fontweight="bold", size = 26)
plt.xticks(size=20, rotation="vertical")
plt.yticks(size=20)
plt.title("Total Number of Incidents Per National Park", fontweight="bold", size = 30)
#Annotate each point with its exact incident count.
for index in range(len(x)):
    ax.text(x[index], y.iloc[index], y.iloc[index], size=26)
plt.show()
plt.figure(figsize=(12,7));
plt.plot(HealthBySpecies["Black Bear"], label="Black Bear", marker="o", mew=8, linewidth=2, color = "violet")
plt.plot(HealthBySpecies["Elk"], label="Elk", marker="o", mew=8, linewidth=2)
plt.plot(HealthBySpecies["Grizzly Bear"], label="Grizzly Bear", marker="o", mew=8, linewidth=2)
plt.plot(HealthBySpecies["Mule Deer"], label="Mule Deer", marker="o", mew=8, linewidth=2)
plt.ylabel('Number of Incidents', fontweight="bold", size = 18)
plt.xlabel('Animal Health Status', fontweight="bold", size = 18);
plt.title('Total Number of Incidents by Health Status for 4 Most Frequent Species', fontweight="bold", size = 22)
plt.xticks(size=16, rotation='vertical')
plt.yticks(size=16)
plt.legend(prop={"size":16});
plt.show()
plt.figure(figsize=(12,7));
plt.plot(IncBySpecies["Black Bear"], label="Black Bear", marker="o", mew=8, linewidth=2, color="violet")
plt.plot(IncBySpecies["Elk"], label="Elk", marker="o", mew=8, linewidth=2)
plt.plot(IncBySpecies["Grizzly Bear"], label="Grizzly Bear", marker="o", mew=8, linewidth=2)
plt.plot(IncBySpecies["Mule Deer"], label="Mule Deer", marker="o", mew=8, linewidth=2)
plt.ylabel('Number of Incidents', fontweight="bold", size = 18)
plt.xlabel('Incident Type', fontweight="bold", size = 18);
plt.title('Total Number of Incidents by Type for 4 Most Frequent Species', fontweight="bold", size = 22)
plt.xticks(size=16, rotation='vertical')
plt.yticks(size=16)
plt.legend(loc="upper right", prop={"size":16});
plt.show()
#Modifying color cycler so that the colors do not repeat across the plotted lines.
plot_colors = ["mediumaquamarine", "darkcyan", "deepskyblue", "firebrick", "mediumpurple", "darkorchid", "magenta", "red", "sienna", "saddlebrown", "peru", "darkorange", "tan", "goldenrod", "gold", "darkkhaki", "olive", "yellowgreen", "olivedrab", "chartreuse", "darkseagreen", "palegreen", "forestgreen", "limegreen", "mediumspringgreen"]
plt.rcParams['axes.prop_cycle'] = plt.cycler(color=plot_colors)
plt.figure(figsize=(25,15));
plt.plot(IncidentsByTypeByPark, label=IncidentsByTypeByPark.columns, marker="o", mew=8, linewidth=4);
plt.xlabel("National Park", fontweight="bold", size = 22);
plt.ylabel("Number of Incidents", fontweight="bold", size = 22);
plt.xticks(size=16, rotation="vertical")
plt.yticks(size=16)
plt.legend(loc="upper right", prop={"size":22});
plt.title("Number of Incidents by National Park, by Incident Type", fontweight="bold", size = 30);
plt.show();
plot_colors = ["mediumaquamarine", "darkcyan", "deepskyblue", "firebrick", "mediumpurple", "darkorchid", "violet", "magenta", "deeppink", "palevioletred", "rosybrown", "lightcoral", "red", "sienna", "saddlebrown", "peru", "darkorange", "tan", "goldenrod", "gold", "darkkhaki", "olive", "yellowgreen", "olivedrab", "chartreuse", "darkseagreen", "palegreen", "forestgreen", "limegreen", "mediumspringgreen"]
plt.rcParams['axes.prop_cycle'] = plt.cycler(color=plot_colors)
#Plot elongated to be able to better view the distinct lines with lower values.
plt.figure(figsize=(15,25));
plt.plot(IncidentsByYear, label=IncidentsByYear.columns, marker="o", mew=8, linewidth=3);
plt.xlabel("Incident Date", fontweight="bold", size = 22);
plt.ylabel("Number of Incidents", fontweight="bold", size = 22);
plt.xticks(size=16)
plt.yticks(size=16)
plt.legend(loc="upper left", prop={"size":22});
plt.title("Number of Incidents Per Year, by Incident Type", fontweight="bold", size = 30);
plt.show();
#Modifying color cycler so that the colors are not repeating across different parks.
plot_colors = ["mediumaquamarine", "mediumturquoise", "darkcyan", "deepskyblue", "steelblue", "mediumpurple", "darkorchid", "violet", "magenta", "deeppink", "palevioletred", "rosybrown", "lightcoral", "firebrick", "red", "sienna", "saddlebrown", "peru", "darkorange", "tan", "goldenrod", "gold", "darkkhaki", "olive", "yellowgreen", "olivedrab", "chartreuse", "darkseagreen", "palegreen", "forestgreen", "limegreen", "mediumspringgreen"]
plt.rcParams['axes.prop_cycle'] = plt.cycler(color=plot_colors)
plt.figure(figsize=(15,25));
plt.plot(IncidentsByYearByPark, label=IncidentsByYearByPark.columns, marker="o", mew=8, linewidth=4);
plt.xlabel("Incident Date", fontweight="bold", size = 22);
plt.ylabel("Number of Incidents", fontweight="bold", size = 22);
plt.xticks(size=16)
plt.yticks(size=16)
plt.legend(loc="upper left", prop={"size":16});
plt.title("Number of Incidents Per Year, by National Park", fontweight="bold", size = 30);
plt.show();
plot_colors = ["mediumaquamarine", "darkcyan", "deepskyblue", "firebrick", "mediumpurple", "darkorchid", "violet", "magenta", "deeppink", "palevioletred", "rosybrown", "lightcoral", "red", "sienna", "saddlebrown", "peru", "darkorange", "tan", "goldenrod", "gold", "darkkhaki", "olive", "yellowgreen", "olivedrab", "chartreuse", "darkseagreen", "palegreen", "forestgreen", "limegreen", "mediumspringgreen"]
plt.rcParams['axes.prop_cycle'] = plt.cycler(color=plot_colors)
plt.figure(figsize=(15,25));
plt.plot(IncidentsByMonth, label=IncidentsByMonth.columns, marker="o", mew=8, linewidth=4);
plt.xlabel("Month", fontweight="bold", size = 22);
plt.ylabel("Number of Incidents", fontweight="bold", size = 22);
plt.xticks(rotation="vertical", size=16)
plt.yticks(size=16)
plt.legend(loc="upper left", prop={"size":22});
plt.title("Number of Incidents Per Month of the Year, by Incident Type", fontweight="bold", size = 30);
plt.show();
ANALYSIS AFTER PLOTTING
Research Question 1: What patterns can be found in location and time of year for each of the following variables: human activities, animals involved, cause, and incident type? How do these patterns differ year over year?
Answer to Research Question 2: “What incidents are the most concerning (i.e. where there is potential risk for humans or animals)?” Among Incident Types, “Human Wildlife Interactions” is the most frequent at nearly 50,000 incidents, followed by “Rescued/Recovered/Found Wildlife” at over 10,000. Rescued/Recovered/Found Wildlife has the most direct implications for animal health, while Human Wildlife Interactions introduce potential risk for both humans and animals.
#If needed, reset the color cycler back to the matplotlib default
#(the 'bgrcmyk' cycle was only the default before matplotlib 2.0)
#import matplotlib as mpl
#mpl.rcParams['axes.prop_cycle'] = mpl.rcParamsDefault['axes.prop_cycle']
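As an alternative to resetting the cycler by hand, a custom palette can be applied to a single plot with plt.rc_context, which restores the previous settings on exit. The sketch below is a minimal illustration. The helper name plot_with_palette and the toy series are hypothetical and not part of the notebook.

```python
import matplotlib.pyplot as plt
from cycler import cycler

def plot_with_palette(series_by_label, colors):
    """Plot each series using a temporary color cycle and return the figure."""
    # rc_context restores the previous rcParams when the block exits,
    # so the global color cycle is left untouched for later plots.
    with plt.rc_context({"axes.prop_cycle": cycler(color=colors)}):
        # The axes pick up the color cycle when they are created,
        # so they must be created inside the context.
        fig, ax = plt.subplots(figsize=(8, 5))
        for label, values in series_by_label.items():
            ax.plot(values, marker="o", label=label)
        ax.legend()
    return fig

cycle_before = plt.rcParams["axes.prop_cycle"]
fig = plot_with_palette({"A": [1, 2, 3], "B": [3, 2, 1]}, ["darkcyan", "firebrick"])
cycle_after = plt.rcParams["axes.prop_cycle"]
```

With this approach each plot carries its own palette, and nothing needs to be reset afterwards.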